Abstract. We derive a Belief-Propagation algorithm for counting large loops in a 
directed network. We evaluate the distribution of the number of small loops in a 
directed random network with given degree sequence. We apply the algorithm to a 
few characteristic directed networks of various network sizes and loop structures and 
compare its output with exhaustive counting results when possible. The algorithm 
is adequate for estimating loop counts in large directed networks and can be used to 
compare the loop structure of directed networks with that of their randomized counterparts. 
1. Introduction 

The structure of complex networks strongly affects the critical behavior of different 
cooperative models [1] and the nonlinear dynamical processes that take place on the 
network [2]. 

In particular, both the directionality of the links, which implies a non-symmetric 
interaction [3, 4, 5], and the local loop structure of the network [6], which correlates 
neighboring nodes, have important dynamical consequences. Directionality of the links 
becomes particularly important when a transport process of mass or information takes 
place on the network [3], and the loop structure of these directed networks is crucial for 
assessing the networks' robustness and determining the load distribution. 

Directed networks are ubiquitous in both man-made and natural systems. Examples 
include the Texas power grid, the World Wide Web, foodwebs, and biological networks 
such as metabolic, transcription and neural networks. The local structure of directed 
networks is radically different from the structure of their undirected versions [7]. While 
many undirected networks are characterized by a large clustering coefficient [8] and a 
large number of short loops [9, 10], this is not a general trend for directed networks. For 
example, the C. elegans neural network has an over-representation of short loops compared 
to a randomized network if the direction of the links is not considered, while it has an 
under-representation of loops when the direction of the links is taken into account [7]. 

Nevertheless, while counting small loops in a given network is a relatively easy 
computation, counting large loops in a real-world network is a very hard task. In fact 
the number of large loops can, and usually does, grow exponentially with the number of 
nodes N in the network. The known efficient exhaustive algorithms [11, 12] for counting 
loops still have a time bound of O(N M (L + 1)), where N, M and L are respectively the 
number of nodes, links and loops in the network. Exhaustive counting therefore becomes 
computationally infeasible for large loops in many real networks. Two different approaches 
to the study of long loops have been proposed: Monte Carlo algorithms and 
Belief-Propagation (BP) algorithms. Both approaches have been pursued 
in the case of undirected networks [13, 14, 15]. The BP algorithm [14] is a heuristic 
algorithm which does not have the sampling bias of the Monte Carlo algorithm [13] and 
is observed to give good results as the size of the network increases. 
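
To make the quantity being counted concrete, the following minimal sketch enumerates the simple directed loops of a tiny graph by depth-first search. It is not the algorithm of Refs. [11, 12]; the function name and the toy graph are illustrative assumptions, and the running time grows exponentially, which is exactly the obstacle described above.

```python
from collections import defaultdict

def count_loops_by_length(edges):
    """Count simple directed loops of each length by exhaustive search.

    Every loop is counted once: a search started at node s only visits
    nodes larger than s, so s is always the smallest node of the loop.
    Exponential time, intended for very small graphs only.
    """
    out = defaultdict(set)
    nodes = set()
    for a, b in edges:
        out[a].add(b)
        nodes.update((a, b))
    counts = defaultdict(int)

    def extend(start, node, visited):
        for nxt in out[node]:
            if nxt == start:
                counts[len(visited)] += 1      # closed a loop of this length
            elif nxt > start and nxt not in visited:
                visited.add(nxt)
                extend(start, nxt, visited)
                visited.remove(nxt)

    for s in sorted(nodes):
        extend(s, s, {s})
    return dict(counts)

# Toy graph (hypothetical): directed triangle 0->1->2->0 plus the link 1->0.
toy_edges = [(0, 1), (1, 2), (2, 0), (1, 0)]
loop_counts = count_loops_by_length(toy_edges)
print(loop_counts)
```

On this toy graph the search finds one loop of length 2 and one loop of length 3.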

In this paper we generalize the BP algorithm proposed in [14, 15] to directed 
networks. We analytically derive the outcome of the algorithm in an ensemble of random 
uncorrelated networks with a given sequence of in/out degrees, in agreement with 
the prediction for the average number of loops in this ensemble [7]. We finally study 
the limitations of the algorithm for small network sizes and small numbers of 
loops in the graph. The paper is divided into four further sections. In Section 2, we 
derive the BP algorithm for directed networks following steps similar to those described in 
[15]. In Section 3, we derive the distribution of the small loops in uncorrelated random 
ensembles. In Sections 4 and 5, we describe the steps of the algorithm and its application 
to a few characteristic directed networks. 



2. Derivation of the BP algorithm 

Given a network G of N nodes and M links, we define a partition function Z(u) as the 
generating function of the number $\mathcal{N}_L$ of loops of length L in the network, 

$$Z(u) = \sum_L u^L\, \mathcal{N}_L(G). \tag{1}$$

Starting with this partition function, we can define a free energy f(u) and an entropy 
$\sigma(\ell)$ of the loops of length $L = N\ell$ as follows: 

$$f(u) = \frac{1}{N}\ln Z(u), \qquad \sigma(\ell) = \frac{1}{N}\ln \mathcal{N}_{L=\ell N}. \tag{2}$$
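
As a small numerical illustration of Eqs. (1) and (2), the sketch below evaluates Z(u) and f(u) from a table of loop counts. The function names are illustrative, and the counts used are those of a hypothetical three-node graph with one loop of length 2 and one loop of length 3.

```python
import math

def partition_function(loop_counts, u):
    """Z(u) = sum over L of u**L * N_L, for a dict {L: N_L} (Eq. (1))."""
    return sum(n_l * u ** length for length, n_l in loop_counts.items())

def free_energy(loop_counts, u, n_nodes):
    """f(u) = ln Z(u) / N (Eq. (2))."""
    return math.log(partition_function(loop_counts, u)) / n_nodes

# Hypothetical counts: one 2-loop and one 3-loop on N = 3 nodes.
counts = {2: 1, 3: 1}
for u in (0.5, 1.0, 2.0):
    print(u, partition_function(counts, u), free_energy(counts, u, n_nodes=3))
```

Increasing u shifts the weight of the sum towards longer loops, which is why scanning u lets one probe the whole range of loop lengths.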

For each directed link $l = (ij)$ from node i to node j, we define a 
variable $S_l \in \{0, 1\}$ which indicates whether a given loop passes through the link l. 
The partition function Z(u) can then be written as 

$$Z(u) = \sum_{\{S_l\}} w(\{S_l\})\, u^{\sum_l S_l}, \tag{3}$$

where $w(\{S_l\})$ is an indicator function of the loops, i.e. it is 1 if the variables $S_l = 1$ have 
a support which forms a closed loop, and it is zero otherwise. As in Refs. [14, 15], 
we take for simplicity a relaxed local form of the indicator function $w(\{S_l\})$ which is 
1 also if the assignment of the link variables $S_l$ is compatible with a few disconnected 
loops. In particular we take $w(\{S_l\})$ as 

$$w(\{S_l\}) = \prod_{i=1}^{N} w_i(\{S\}_i), \tag{4}$$

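where $\{S\}_i$ is the set of variables $S_l$ on the links joining i to the nodes $j \in \partial i$, $\partial i$ indicates 
the set of nodes either pointing to i or pointed to by i, and $w_i(\{S\}_i)$ is defined as 

$$w_i(\{S\}_i) = \begin{cases}
1 & \text{if } \sum_{j\in\partial_+ i} S_{(ji)} = 1 \text{ and } \sum_{j\in\partial_- i} S_{(ij)} = 1, \\
1 & \text{if } \sum_{j\in\partial_+ i} S_{(ji)} = 0 \text{ and } \sum_{j\in\partial_- i} S_{(ij)} = 0, \\
0 & \text{otherwise,}
\end{cases}$$

with $\partial_+ i$ and $\partial_- i$ indicating the sets of nodes j which point to i or which are pointed to by 
i, respectively. Finding the free energy f(u) associated with the partition function (3) 
can be cast as finding the normalized distribution $p_v(\{S_l\})$ which minimizes the Kullback 
distance 

$$F_{\mathrm{Gibbs}}[p_v] = \sum_{\{S_l\}} p_v(\{S_l\}) \ln\left(\frac{p_v(\{S_l\})}{w(\{S_l\})\, u^{\sum_l S_l}}\right). \tag{5}$$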

In fact it is straightforward to show that $F_{\mathrm{Gibbs}}$ assumes its minimal value when 
$p_v(\{S_l\}) = w(\{S_l\})\, u^{\sum_l S_l}/Z$. If the given network is a tree, the trial distribution $p_v(\{S_l\})$ 
takes the form 

$$P(\{S_l\}) = \Big(\prod_{l} p_l(S_l)\Big)^{-1} \prod_{i} p_i(\{S\}_i), \tag{6}$$

where $p_l$ and $p_i$ are the marginals 

$$p_l(S_l) = \sum_{\{S_{l'}\}\setminus S_l} P(\{S_{l'}\}), \qquad p_i(\{S\}_i) = \sum_{\{S_l\}\setminus \{S\}_i} P(\{S_l\}). \tag{7}$$
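
As a concrete check of Eqs. (3) and (4), the relaxed partition function can be evaluated by brute force on a very small graph. The sketch below (function names are illustrative assumptions) sums over all $2^M$ link configurations and keeps those in which every node has equal in-flow and out-flow, both 0 or 1. By the definition of $w_i$, the empty configuration and unions of disjoint loops are accepted as well.

```python
from itertools import product

def local_indicator_ok(edges, s):
    """Relaxed indicator w of Eq. (4): at every node, the number of selected
    incoming links equals the number of selected outgoing links, and is 0 or 1."""
    in_sum, out_sum = {}, {}
    for (a, b), s_l in zip(edges, s):
        out_sum[a] = out_sum.get(a, 0) + s_l
        in_sum[b] = in_sum.get(b, 0) + s_l
    nodes = set(in_sum) | set(out_sum)
    return all(in_sum.get(n, 0) == out_sum.get(n, 0) <= 1 for n in nodes)

def relaxed_partition_function(edges, u):
    """Z(u) of Eq. (3), summing over all 2**M assignments of the link variables."""
    total = 0.0
    for s in product((0, 1), repeat=len(edges)):
        if local_indicator_ok(edges, s):
            total += u ** sum(s)
    return total

# Two disjoint 2-loops: empty set, either loop, or both loops are accepted.
two_loops = [(0, 1), (1, 0), (2, 3), (3, 2)]
print(relaxed_partition_function(two_loops, 2.0))   # 1 + 2*u**2 + u**4 = 25.0
```

The term $u^4$ in this example comes from the two disconnected loops being selected together, which is the relaxation discussed after Eq. (3).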

In a real case, when the network is not a tree, we can still take a variational approach 
and use a trial distribution of the form (6). We then have to minimize the Bethe free 
energy $F_{\mathrm{Bethe}}$, 

$$F_{\mathrm{Bethe}}[\{p_i\},\{p_l\}] = \sum_i \sum_{\{S\}_i} p_i(\{S\}_i) \ln\left(\frac{p_i(\{S\}_i)}{w_i(\{S\}_i)}\right) - \sum_l \sum_{S_l} p_l(S_l) \ln\left(p_l(S_l)\, u^{S_l}\right). \tag{8}$$

For each link l starting from i and ending in j, the marginals must satisfy the constraints 

$$p_l(S_l) = \sum_{\{S\}_i\setminus S_l} p_i(\{S\}_i), \qquad p_l(S_l) = \sum_{\{S\}_j\setminus S_l} p_j(\{S\}_j). \tag{9}$$

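Introducing Lagrange multipliers enforcing the conditions (9) and the 
normalization of the probabilities, it is easy to show that a possible parametrization 
of the marginals is the following: 

$$p_i(\{S\}_i) = \frac{1}{C_i}\, w_i(\{S\}_i) \prod_{j\in\partial_+ i} \left(u\, y_{j\to i}\right)^{S_{(ji)}} \prod_{j\in\partial_- i} \left(u\, \hat{y}_{j\to i}\right)^{S_{(ij)}}, \qquad p_l(S_l) = \frac{1}{C_l} \left(u\, y_{i\to j}\, \hat{y}_{j\to i}\right)^{S_l}. \tag{10}$$

For every directed link (ij) from node i to node j, the values of the messages $y_{i\to j}$ and 
$\hat{y}_{j\to i}$ are fixed by the constraints in Eq. (9) to satisfy the following BP equations: 

$$y_{i\to j} = \frac{u \sum_{k\in\partial_+ i} y_{k\to i}}{1 + u^2 \sum_{k'\in\partial_- i\setminus j} \hat{y}_{k'\to i} \sum_{k\in\partial_+ i} y_{k\to i}}, \qquad
\hat{y}_{j\to i} = \frac{u \sum_{k\in\partial_- j} \hat{y}_{k\to j}}{1 + u^2 \sum_{k'\in\partial_+ j\setminus i} y_{k'\to j} \sum_{k\in\partial_- j} \hat{y}_{k\to j}}. \tag{11}$$

The normalization constants for the marginals are consequently given by 

$$C_l = 1 + u\, y_{i\to j}\, \hat{y}_{j\to i}, \qquad C_i = 1 + u^2 \sum_{k'\in\partial_- i} \hat{y}_{k'\to i} \sum_{k\in\partial_+ i} y_{k\to i}. \tag{12}$$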

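The Bethe free energy density $f_{\mathrm{Bethe}} = -\frac{1}{N} F_{\mathrm{Bethe}}$ becomes 

$$N f_{\mathrm{Bethe}}(u) = -\sum_{l=1}^{M} \ln C_l + \sum_{i=1}^{N} \ln C_i. \tag{13}$$

For any given value of u the normalized loop length $\ell = L/N$ is given by 

$$\ell(u) = \frac{1}{N} \sum_{l} p_l(1) = \frac{1}{N} \sum_{l=(ij)} \frac{u\, y_{i\to j}\, \hat{y}_{j\to i}}{1 + u\, y_{i\to j}\, \hat{y}_{j\to i}}. \tag{14}$$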
The function $\ell(u)$ can be inverted, giving the function $u(\ell)$ and finally providing an 
expression for the entropy of the loops in the graph under a Bethe variational approach, 

$$\sigma_{\mathrm{Bethe}}(\ell) = f(u(\ell)) - \ell \ln u(\ell). \tag{15}$$
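
The relation between the free energy and the entropy in Eq. (15) is a standard saddle-point (Legendre transform) argument. The sketch below spells it out under the assumption, not stated explicitly in the text, that the number of loops scales as $\mathcal{N}_{L=\ell N} \simeq e^{N\sigma(\ell)}$ for large N with a smooth, concave $\sigma(\ell)$.

```latex
\begin{align*}
Z(u) &= \sum_L u^L \mathcal{N}_L
      \simeq N \int d\ell \; e^{N[\sigma(\ell) + \ell \ln u]}
      \quad\Longrightarrow\quad
      f(u) = \frac{1}{N}\ln Z(u) \simeq \max_{\ell}\,[\sigma(\ell) + \ell \ln u], \\
\left.\frac{d\sigma}{d\ell}\right|_{\ell=\ell(u)} &= -\ln u,
      \qquad
      \ell(u) = \frac{d f}{d \ln u}, \\
\sigma(\ell(u)) &= f(u) - \ell(u)\ln u .
\end{align*}
```

In this reading, each value of u selects the loop length $\ell(u)$ that dominates the sum in Eq. (1), and scanning u traces out the entropy curve.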

3. Derivation of the typical number of short loops in random directed 
networks with a given degree sequence 

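We consider an ensemble of random directed networks with given degree sequence 
$\{k_i^{in}, k_i^{out}\}$, $i = 1,\ldots,N$. If the maximal in/out connectivities $K_{in}$, $K_{out}$ of the network 
satisfy the inequality $K_{in} K_{out} < \langle k_{in}\rangle N$, the network is uncorrelated. By $q_{k_{in},k_{out}}$ we 
indicate the degree distribution of the ensemble. In Ref. [7] an expression for the 
average number $\mathcal{N}_L$ of small loops was given, 

$$\langle \mathcal{N}_L \rangle \simeq \frac{1}{L} \left(\frac{\langle k_{in} k_{out}\rangle}{\langle k_{in}\rangle}\right)^{L}, \tag{16}$$

valid as long as 

$$L \ll N\, \frac{\langle k_{in} k_{out}\rangle^2}{\langle (k_{in} k_{out})^2\rangle}. \tag{17}$$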
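
The estimate in Eq. (16) depends on the degree sequence only through two moments, so it is easy to evaluate. The following sketch (function names are illustrative assumptions) computes it for a given in/out degree sequence and checks the uncorrelated-network condition on the maximal degrees quoted above.

```python
def expected_short_loops(k_in, k_out, length):
    """Average number of loops of a given length, Eq. (16):
    (1/L) * (<k_in k_out> / <k_in>)**L."""
    n = len(k_in)
    mean_in = sum(k_in) / n
    mean_prod = sum(a * b for a, b in zip(k_in, k_out)) / n
    return (mean_prod / mean_in) ** length / length

def is_below_structural_cutoff(k_in, k_out):
    """Condition K_in * K_out < <k_in> * N for an uncorrelated network."""
    n = len(k_in)
    mean_in = sum(k_in) / n
    return max(k_in) * max(k_out) < mean_in * n

# Hypothetical sequence: 100 nodes, each with in-degree 2 and out-degree 2.
k_in = [2] * 100
k_out = [2] * 100
print(is_below_structural_cutoff(k_in, k_out))
print([expected_short_loops(k_in, k_out, L) for L in (2, 3, 4)])
```

For this regular sequence the ratio of moments is 2, so the estimate is $2^L/L$: the number of short loops grows geometrically with L but does not depend on N.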

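It is an interesting exercise to derive the distribution of the number of small loops 
in the ensemble of directed networks by solving the BP equations for a random directed 
ensemble, in parallel with the distribution found in the undirected case [15]. In a directed 
network ensemble the BP messages $y$ and $\hat{y}$ along each link are each identically distributed, 
with distributions depending only on the value of u. Given the BP equations (11), the 
distribution $P(y;u)$ of the field y has to satisfy the self-consistent equation 

$$\begin{aligned}
P(y;u) = {} & \sum_{k_{out}=1}^{\infty} \frac{k_{out}}{\langle k_{out}\rangle}\, q_{0,k_{out}}\, \delta(y)
 + \sum_{k_{in}=1}^{\infty} \sum_{k_{out}=1}^{\infty} \frac{k_{out}}{\langle k_{out}\rangle}\, q_{k_{in},k_{out}}
 \int_0^{\infty} dy_1 P(y_1;u) \cdots dy_{k_{in}} P(y_{k_{in}};u) \\
 & \times \int_0^{\infty} d\hat{y}_1 \hat{P}(\hat{y}_1;u) \cdots d\hat{y}_{k_{out}-1} \hat{P}(\hat{y}_{k_{out}-1};u)\;
 \delta\big(y - g_k(\{y\},\{\hat{y}\})\big)
\end{aligned} \tag{18}$$

with 

$$g_1 = u\, y_1, \qquad
g_k = \frac{u \sum_{n\in\partial_+ i} y_{n\to i}}{1 + u^2 \sum_{n'\in\partial_- i\setminus j} \hat{y}_{n'\to i} \sum_{n\in\partial_+ i} y_{n\to i}} \quad \text{for } k \ge 2. \tag{19}$$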

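In fact, given a random edge, the probability that its starting node i has connectivity 
$(k_{in}, k_{out})$ is given by $\frac{k_{out}}{\langle k_{out}\rangle}\, q_{k_{in},k_{out}}$. The fields $\hat{y}$ have to satisfy a similar recursive 
equation, i.e. 

$$\begin{aligned}
\hat{P}(\hat{y};u) = {} & \sum_{k_{in}=1}^{\infty} \frac{k_{in}}{\langle k_{in}\rangle}\, q_{k_{in},0}\, \delta(\hat{y})
 + \sum_{k_{in}=1}^{\infty} \sum_{k_{out}=1}^{\infty} \frac{k_{in}}{\langle k_{in}\rangle}\, q_{k_{in},k_{out}}
 \int_0^{\infty} dy_1 P(y_1;u) \cdots dy_{k_{in}-1} P(y_{k_{in}-1};u) \\
 & \times \int_0^{\infty} d\hat{y}_1 \hat{P}(\hat{y}_1;u) \cdots d\hat{y}_{k_{out}} \hat{P}(\hat{y}_{k_{out}};u)\;
 \delta\big(\hat{y} - \hat{g}_k(\{y\},\{\hat{y}\})\big)
\end{aligned} \tag{20}$$

with 

$$\hat{g}_1 = u\, \hat{y}_1, \qquad
\hat{g}_k = \frac{u \sum_{n\in\partial_- j} \hat{y}_{n\to j}}{1 + u^2 \sum_{n'\in\partial_+ j\setminus i} y_{n'\to j} \sum_{n\in\partial_- j} \hat{y}_{n\to j}} \quad \text{for } k \ge 2. \tag{21}$$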

For a given small value of $u = u_m + \epsilon$, the two coupled equations (18) and (20) 
become independent. By proceeding as in [15], we find that the number of small loops 
in the ensemble is given by 

$$\langle \mathcal{N}_L \rangle \simeq \frac{1}{L} \left(\frac{\langle k_{in} k_{out}\rangle}{\langle k_{in}\rangle}\right)^{L} \tag{22}$$

with Poisson fluctuations for loops of size $L \ll \log(N)$. For larger loop sizes up to the 
boundary given by (17), the average number of loops in the ensemble is still given 
by (22) but with significant fluctuations in the number of loops. 
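
The prediction (22) can be checked numerically for the shortest loops. The sketch below is an illustration under a simplifying assumption: instead of an ensemble with a fixed degree sequence, it uses a directed Erdos-Renyi graph, where in- and out-degrees are independent with mean c, so that $\langle k_{in} k_{out}\rangle / \langle k_{in}\rangle = c$ and the expected number of 3-loops is close to $c^3/3$. All names and parameter values are hypothetical.

```python
import random

def random_directed_graph(n, c, rng):
    """Directed Erdos-Renyi graph: each ordered pair i->j (i != j) is a link
    with probability c / (n - 1), so the mean in- and out-degree is c."""
    p = c / (n - 1)
    out = [set() for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and rng.random() < p:
                out[i].add(j)
    return out

def count_triangles(out):
    """Number of directed 3-loops i->j->k->i, each counted once."""
    closed_walks = 0
    for i, nbrs in enumerate(out):
        for j in nbrs:
            for k in out[j]:
                if i in out[k]:
                    closed_walks += 1
    return closed_walks // 3   # every 3-loop is found from each of its 3 nodes

rng = random.Random(0)
c = 3.0
samples = [count_triangles(random_directed_graph(150, c, rng)) for _ in range(30)]
mean_triangles = sum(samples) / len(samples)
prediction = c ** 3 / 3
print(mean_triangles, prediction)
```

The sample-to-sample spread of the counts is of the order of the square root of the mean, in line with the Poisson fluctuations mentioned above.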

4. The BP algorithm 

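The study of the partition function Eq. (3) carried out in Section 2 allows a 
new algorithm for counting large loops in a directed network to be formulated. In 
particular, given a network with N nodes and M links, the algorithm is: 

• Initialize the messages $y_{i\to j}$, $\hat{y}_{j\to i}$ for every directed link between i and j to random 
values. 

• For every value of u, iterate the BP equations in Eq. (11), 

$$y_{i\to j} = \frac{u \sum_{k\in\partial_+ i} y_{k\to i}}{1 + u^2 \sum_{k'\in\partial_- i\setminus j} \hat{y}_{k'\to i} \sum_{k\in\partial_+ i} y_{k\to i}}, \qquad
\hat{y}_{j\to i} = \frac{u \sum_{k\in\partial_- j} \hat{y}_{k\to j}}{1 + u^2 \sum_{k'\in\partial_+ j\setminus i} y_{k'\to j} \sum_{k\in\partial_- j} \hat{y}_{k\to j}}, \tag{23}$$

until convergence. 

• Calculate $\ell(u)$ and $f(u)$ from Eqs. (14) and (13), which we recall here for 
convenience: 

$$\ell(u) = \frac{1}{N} \sum_{l} p_l(1) = \frac{1}{N} \sum_{l=(ij)} \frac{u\, y_{i\to j}\, \hat{y}_{j\to i}}{1 + u\, y_{i\to j}\, \hat{y}_{j\to i}}, \tag{24}$$

$$N f_{\mathrm{Bethe}}(u) = -\sum_{l=1}^{M} \ln\left(1 + u\, y_{i\to j}\, \hat{y}_{j\to i}\right) + \sum_{i=1}^{N} \ln\Big(1 + u^2 \sum_{k'\in\partial_- i} \hat{y}_{k'\to i} \sum_{k\in\partial_+ i} y_{k\to i}\Big). \tag{25}$$

• Evaluate $\sigma(\ell)$ by Eq. (15), which again we repeat here for convenience: 

$$\sigma_{\mathrm{Bethe}}(\ell(u)) = f(u) - \ell(u) \ln u. \tag{26}$$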
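
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: messages are updated in parallel, convergence is declared when no message changes by more than `tol`, the graph has no multiple links, and the function name, arguments and return values are hypothetical. Here $\partial_+ i$ is taken as the set of nodes pointing to i and $\partial_- i$ as the set of nodes pointed to by i.

```python
import math
import random
from collections import defaultdict

def bp_loop_entropy(edges, u, max_iter=10000, tol=1e-12, seed=0):
    """Run BP at fixed u; return (ell, f, sigma, converged).

    y[(i, j)] is the message sent along the link i->j and yhat[(i, j)]
    the message sent back from j to i on the same link.
    """
    rng = random.Random(seed)
    nodes = sorted({n for e in edges for n in e})
    y = {e: rng.random() for e in edges}
    yhat = {e: rng.random() for e in edges}

    def node_sums(y, yhat):
        a = defaultdict(float)   # a[i]: sum of y over links pointing to i
        b = defaultdict(float)   # b[i]: sum of yhat over links leaving i
        for (i, j) in edges:
            a[j] += y[(i, j)]
            b[i] += yhat[(i, j)]
        return a, b

    converged = False
    for _ in range(max_iter):
        a, b = node_sums(y, yhat)
        new_y, new_yhat = {}, {}
        for (i, j) in edges:
            other_out = max(b[i] - yhat[(i, j)], 0.0)  # out-links of i except i->j
            other_in = max(a[j] - y[(i, j)], 0.0)      # in-links of j except i->j
            new_y[(i, j)] = u * a[i] / (1.0 + u * u * a[i] * other_out)
            new_yhat[(i, j)] = u * b[j] / (1.0 + u * u * b[j] * other_in)
        delta = max(max(abs(new_y[e] - y[e]), abs(new_yhat[e] - yhat[e]))
                    for e in edges)
        y, yhat = new_y, new_yhat
        if delta < tol:
            converged = True
            break

    a, b = node_sums(y, yhat)
    link_terms = [u * y[e] * yhat[e] for e in edges]
    ell = sum(t / (1.0 + t) for t in link_terms) / len(nodes)          # Eq. (24)
    n_f = -sum(math.log(1.0 + t) for t in link_terms)                  # Eq. (25)
    n_f += sum(math.log(1.0 + u * u * a[i] * b[i]) for i in nodes)
    f = n_f / len(nodes)
    sigma = f - ell * math.log(u)                                      # Eq. (26)
    return ell, f, sigma, converged

# Usage on a hypothetical test graph: the complete directed graph on 4 nodes.
k4 = [(i, j) for i in range(4) for j in range(4) if i != j]
ell, f, sigma, ok = bp_loop_entropy(k4, u=1.0)
print(ell, f, sigma, ok)
```

Repeating the call over a grid of u values gives the parametric curve $(\ell(u), \sigma(\ell(u)))$. On nodes with a single outgoing link the denominator reduces to 1, so messages can grow without bound for large u; the returned `converged` flag should be checked before the result is used.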
Figure 1. Entropy $\sigma(L/N)$ of the loops of length L for the real Chesapeake foodweb 
(solid line) and the entropy of the loops counted by exact enumeration (diamonds). 

5. Application of the algorithm to real directed networks 

We applied the algorithm to a large set of directed networks [17]. For 
some of these networks we calculated the number of loops $\mathcal{N}_L$ of length L directly by 
exact enumeration [12]. We then compare the entropy of the loops $\sigma(\ell)$ found by the 
BP algorithm with the entropy of the loops $\sigma_{\mathrm{exact}}(\ell)$ found by direct enumeration of the 
number of loops, 

$$\sigma_{\mathrm{exact}}(\ell) = \frac{1}{N} \ln\left(\mathcal{N}_L^{\mathrm{exact}}\right). \tag{27}$$
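
Equation (27) turns a table of exact loop counts into points that can be overlaid on the BP entropy curve. A minimal sketch, with an illustrative function name and, as input, the loop counts of the complete directed graph on four nodes (6 loops of length 2, 8 of length 3 and 6 of length 4):

```python
import math

def exact_entropy_points(loop_counts, n_nodes):
    """Map {L: N_L} to sorted (L/N, ln(N_L)/N) points, following Eq. (27).
    Lengths with no loops are skipped, since their entropy is undefined."""
    return sorted((length / n_nodes, math.log(count) / n_nodes)
                  for length, count in loop_counts.items() if count > 0)

points = exact_entropy_points({2: 6, 3: 8, 4: 6}, n_nodes=4)
for ell, sigma in points:
    print(ell, sigma)
```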

We note that for foodwebs with a small number of nodes the algorithm does not 
provide a good approximation for the number of loops present in the graph. A dramatic 
example is the Chesapeake foodweb. In this case we were able to count all the loops 
in the network exhaustively, since the network contains very few loops. Because the 
loops are few, the BP algorithm strongly overestimates the length of the largest 
loop in the network (see Figure 1): it predicts a largest loop of length $L_{max} = 12$, 
whereas the largest loop has length $L_{max} = 7$. This effect is observed to be present also 
in the undirected BP algorithm [14]. 

The discrepancy is predicted to be strong only in cases where the size of the network 
is small and the number of loops in the network is small, as in the Chesapeake 
case. When the network has a larger number of loops and the entropy of the loops is 
larger, much better results are expected. In the case of the C. elegans neural network 
(N = 306) the entropy for small loops overlaps with the results of exact 
enumeration, as can clearly be seen in Figure 2. We further compare the results of the 
algorithm on a given network and on a randomized network ensemble. A typical example 
is the metabolic network of E. coli [17], for which we compared the entropy provided 
by the BP algorithm with the entropy of a series of 100 random networks with the same 
degree distribution. 
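
The text does not specify how the randomized networks were generated. One common choice, shown here purely as an assumption, is to rewire pairs of links while keeping every node's in-degree and out-degree fixed. The function name and the toy graph are hypothetical.

```python
import random

def randomize_directed(edges, n_swaps, seed=0):
    """Degree-preserving randomization by repeated link swaps.

    Two links a->b and c->d are rewired to a->d and c->b. Swaps that
    would create a self-loop or a duplicate link are rejected, so every
    node keeps its in-degree and its out-degree.
    """
    rng = random.Random(seed)
    edge_list = list(edges)
    edge_set = set(edge_list)
    done = 0
    attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        p, q = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[p], edge_list[q]
        if a == d or c == b or (a, d) in edge_set or (c, b) in edge_set:
            continue
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edge_list[p], edge_list[q] = (a, d), (c, b)
        done += 1
    return edge_list

# Hypothetical toy graph: a directed ring of 10 nodes with three extra links.
toy_edges = [(i, (i + 1) % 10) for i in range(10)] + [(0, 5), (3, 8), (7, 2)]
randomized = randomize_directed(toy_edges, n_swaps=50, seed=1)
print(randomized)
```

Running the loop-entropy estimate on many such randomized copies gives the reference curve against which the real network is compared.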
Figure 2. Entropy $\sigma(L/N)$ of the loops of length L for the real C. elegans neural 
network (solid line) and the entropy of the loops counted by exact enumeration for 
small loops (small diamonds). 



6. Conclusions 

In conclusion, we provide a new algorithm for counting large loops in directed networks. 
The algorithm is predicted to give good results only for large network sizes N. In 
the paper we demonstrate cases in which it fails to predict the right entropy and loop 
structure due to the small size of the network. We propose to study the significance 
of the loop structure in large networks by comparing the results of the algorithm on real 
networks and randomized networks when the networks are large and the number of loops in 
the networks is also large. 
Figure 3. Entropy $\sigma(L/N)$ of the loops of length L for the real metabolic network and 
average entropy of the loops in the randomized network ensemble with the same degree 
sequence. 